Three Strategies for Concurrent Processing of Frequent Itemset Queries Using FP-Growth
Abstract
Frequent itemset mining is often regarded as advanced querying in which a user specifies the source dataset and pattern constraints using a given constraint model. Recently, the problem of optimizing the processing of sets of frequent itemset queries has been considered, and two multiple-query optimization techniques for frequent itemset queries, Mine Merge and Common Counting, have been proposed and tested on the Apriori algorithm. In this paper we discuss and experimentally evaluate three strategies for concurrent processing of frequent itemset queries using FP-growth as the underlying frequent itemset mining algorithm. The first strategy is Mine Merge, which does not depend on a particular mining algorithm and can be applied to FP-growth without modification. The second is an implementation of the general idea of Common Counting for FP-growth. The third is a completely new strategy, motivated by shortcomings of the first two strategies identified in the context of FP-growth.
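As a rough illustration of the general idea behind Common Counting (sharing one database scan among several concurrently processed queries), the sketch below counts item supports for a batch of queries in a single pass over the data. It is not the paper's implementation: the `Query` class, its `selects` predicate, and the function names are hypothetical, and only the initial item-counting pass that FP-growth performs before building its tree is shown.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

# A transaction is (transaction id, set of items); this layout is an assumption.
Transaction = Tuple[int, FrozenSet[str]]


@dataclass(frozen=True)
class Query:
    """Hypothetical frequent itemset query: a source dataset plus a constraint."""
    name: str
    selects: Callable[[int], bool]  # chooses the source dataset by transaction id
    min_support: int                # absolute minimum support count


def shared_scan_item_counts(
    db: Iterable[Transaction], queries: List[Query]
) -> Dict[str, Counter]:
    """Count item supports for every query while reading the database once."""
    counts: Dict[str, Counter] = {q.name: Counter() for q in queries}
    for tid, items in db:            # the single shared scan
        for q in queries:
            if q.selects(tid):       # transaction belongs to this query's dataset
                counts[q.name].update(items)
    return counts


def frequent_items(
    db: Iterable[Transaction], queries: List[Query]
) -> Dict[str, Set[str]]:
    """Return, per query, the items that meet that query's support threshold."""
    counts = shared_scan_item_counts(db, queries)
    return {
        q.name: {item for item, c in counts[q.name].items() if c >= q.min_support}
        for q in queries
    }
```

Transactions that fall into the datasets of several queries are read only once here, which is the saving that scan sharing aims for; running each query separately would read the overlapping part once per query.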
Similar resources
Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm
Discovery of frequent itemsets is a very important data mining problem with numerous applications. Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on frequent itemset mining has been done so far, focusing mainly on developing faster complete mining al...
Integration of candidate hash trees in concurrent processing of frequent itemset queries using Apriori
In this paper we address the problem of processing of batches of frequent itemset queries using the Apriori algorithm. The best solution of this problem proposed so far is Common Counting, which consists in concurrent execution of the queries using Apriori with the integration of scans of the parts of the database shared among the queries. In this paper we propose a new method – Common Candidat...
Integration of Candidate Hash Trees in Concurrent Processing of Frequent Itemset Queries Using Apriori (Control and Cybernetics)
Abstract: Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. In this paper we address the problem of processing batches of frequent itemset queries using the Apriori algorithm. The best solution of this problem proposed so far is Common Counting, which consists in concurrent execution o...
Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions
The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...
Probabilistic Frequent Pattern Growth for Itemset Mining in Uncertain Databases (Technical Report)
Frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied on standard (certain) transaction databases. Uncertain transaction databases consist of sets of existentially uncertain items. The uncertainty of items in transactions makes traditional techniques inapplicable. In this paper, we tackle the problem of finding pr...